INTRODUCTION

The subject of the research in this proposal is the development of methods for examining
molecular alterations in prostate cancer at the level of single cells. The purpose of the
research is to use these methods to identify molecular alterations in prostate cancer cells
that can be used, either singly or in combination, to provide insights into the molecular
evolution of prostate carcinogenesis and to produce a set of molecular tools capable of
influencing the clinical management of patients with prostate carcinoma. The scope of the
research involves the construction of cDNA libraries representing the genes expressed in
selected populations of normal and neoplastic prostate cells, followed by the construction
of microarrays suitable for comprehensive gene expression studies. These arrays will then be
used to evaluate methods for single-cell transcriptome amplification with the aim of
identifying a cohort of cellular transcripts that reflects a cellular phenotype.


BODY 

Technical objective 1: To obtain defined populations of normal and neoplastic prostate cell
types which retain in-situ cellular characteristics

•  Task  1:  obtain  and  pathologically  characterize  fresh  samples  of  normal,  primary 
neoplastic,  and  metastatic  carcinoma.  Prepare  tissue  sections  in  frozen  and  fixed 
formats.  Perform  immunohistochemistry.  Completed. 

• Task 2: purify normal luminal, normal basal, and primary carcinoma cell populations
using flow cytometric sorting. Disaggregate tissues, immuno-label, sort, assess
sorted populations for purity via microscopic examination and by PCR analysis. Sort
single cells into microtiter format. We have sorted and purified normal basal and
luminal cells by flow cytometry and constructed a cDNA library from each population
(described in the previous report). We have sorted primary carcinoma cell populations
(manuscript in preparation; Liu et al). Isolation of RNA from the purified cell
populations has been inconsistent in terms of quality and quantity. Work is ongoing
to optimize the methods using alternative RNA preservation reagents (e.g.
RNAlater).

• Task 3: evaluate alternative tissue digestion protocols. We have disaggregated tissue
samples with trypsin, with EDTA alone, and with Dispase without a significant
improvement in the quality or quantity of extracted RNA compared to the standard
collagenase protocol. Gene expression alterations resulting from the disaggregation
procedure remain a major hurdle for using this approach with flow cytometry as a
means to profile gene expression from solid tissues.

• Task 4: microdissect cohorts of phenotypically distinct prostate cells: luminal
epithelium, basal epithelium, PIN, carcinoma foci, metastatic foci. We have employed a
new approach for microdissection that uses a laser-capture microscope (Arcturus)
and used this methodology to construct 3 new prostate cDNA libraries: one
representing prostate basal cells, one representing prostate luminal cells, and one
representing prostate stromal cells. Following the development of protocols aimed at
optimizing both laser capture microdissection and RNA isolation, 24,000 cells each of
stroma, luminal epithelium, and basal epithelium were captured and the RNA
isolated by spin-column purification methods. cDNA libraries were constructed in a
λ-phage vector using Clontech’s SMART cDNA-PCR method. The respective
libraries were then converted into phagemids and 300-700 clones from each library were
sequenced for initial library characterization (Table 1). Genes specific to each cell
type were identified, including PSA from the luminal cell library, PSCA from the
basal cell library, and vimentin from the stromal cell library. Furthermore, optimized
microscopy methods combining laser catapult microdissection and laser capture
microdissection are now in place to improve capture of basal epithelial cells,
which are more difficult to isolate than other prostate cell types. To date, there are
no published reports characterizing or comparing the gene expression profiles of, or
cDNA libraries constructed from, populations of prostate stromal,
luminal epithelial, or basal epithelial cells. A manuscript detailing these libraries is
in preparation (see reportable outcomes, Moore et al). We anticipate that these
libraries will be useful tools for a variety of applications, including identifying
prostate-specific genes, cell type-specific genes within the prostate, and differential
gene expression analysis. Library construction from PIN, primary carcinoma, and
metastatic carcinoma is in progress.

Table 1. Sequence analysis of cDNA libraries constructed after cell isolation by Laser
Capture Microscopy.

                             Luminal        Basal
                             Epithelium     Epithelium     Stroma
No. clones sequenced         768            288            741
% w/o annotations            27             55             28
% annotated (no. clones)     73 (557)       45 (130)       72 (534)
% mitochondrial              18             34             11
% ribosomal                  3              10             3.7

•  Task  5:  microdissect  single  cells  (20)  from  each  of  the  above-described  phenotypes. 

While we have been able to consistently isolate single cells from prostate cancer sections
using laser capture microdissection, amplifying enough cDNA for cDNA library construction
or cDNA microarray analysis from the limited amount of RNA available in single cells
remains challenging. We are continuing to investigate techniques, including amplification
of mRNA, that will allow us to develop comparative gene expression profiles from individual
prostate cells. At the same time, we have made progress in developing the tools necessary
for the analysis of molecular changes within individual prostate epithelial cells isolated
from peripheral blood, apheresis samples, and/or bone marrow. These tools include
immunostaining cell preparations (epithelial cells isolated from peripheral blood,
apheresis samples, or bone marrow using positive and negative selection methods) for
prostate-specific antigen, capturing positively stained cells by laser capture
microdissection, cell lysis, and DNA analysis for methylation of GST-pi and androgen
receptor mutations. We have successfully amplified and sequenced exon 8 of the androgen
receptor from individual circulating prostate cells and have identified no mutations in a
cohort of 5 patients. We are currently developing a protocol to determine GST-pi
methylation status.

• Task 6: assess RNA quality (preservation) between frozen sections and fixed/stained sections.
As anticipated, our work in this area has demonstrated that the yield of RNA from frozen
tissues is much greater and of higher quality than from comparable quantities of formalin-fixed
tissue. Our current protocol employs a rapid ethanol fixation of frozen tissue with or without
an H&E or immunostain prior to LCM. We have successfully isolated intact RNA from
formalin-fixed tissues, but to date this remains poorly reproducible.

• Task 7: assess feasibility of flow sorted single cell isolation automation. We are not currently
pursuing this approach due to the alterations in gene expression resulting from tissue
disaggregation. Future work may entail flow cytometric isolation of epithelial cells in peripheral
blood or bone marrow.

• Task 8: (future work) refine cell phenotype acquisition based upon the development of new
markers/antibodies. In collaboration with Dr. Alvin Liu in the Department of Molecular
Biotechnology, we have identified several additional antigens recognized with monoclonal
antibodies that can be used for sorting prostate epithelial cells by flow cytometry (see
reportable outcomes, Liu et al). The future application of these discriminating proteins/antigens
will await the development of consistent amplification protocols as described in this
proposal.


Technical objective 2: To construct microarrays of prostate transcripts that reflect the gene
expression potential of the cell types to be examined.

• Task 8: identify a non-redundant clone set from the Prostate Expression Database to
encompass all highly expressed transcripts (~12), moderately expressed transcripts
(~500) and several thousand rare transcripts (~6000).

We  have  now  identified  and  assembled  a  non-redundant  set  of  6,000  cDNAs  (ESTs) 
from  the  prostate  expression  database  that  are  suitable  for  array  construction.  Many 
of  these  genes  are  derived  from  the  cell  type-specific  libraries  described  above. 

• Task 9: retrieve cDNA clones from archive, PCR amplify inserts with amine-linked
primer, and purify. We have retrieved 6,000 cDNA clones from the cDNA archive
and amplified the inserts by PCR. Our current array construction methodology at the
Fred Hutchinson Cancer Center uses poly-lysine coated slides and demonstrates
excellent reproducibility and sensitivity. We have used these arrays for the
identification of genes in the prostate under the control of the androgen receptor and
androgenic ligands (see reportable outcomes, Lin et al).

• Task 10: construct 3 normalized cDNA libraries from flow sorted basal, luminal, and
primary carcinoma (CD44+) without amplification procedures, and evaluate
libraries for quality: diversity and abundance of transcripts.

As  described  in  the  previous  report,  we  have  constructed  cDNA  libraries  from  flow 
sorted  basal  (CD44+),  luminal  (CD57+),  and  primary  carcinoma  (CD44+)  cells.  We 
have  also  now  constructed  cDNA  libraries  from  normal  basal  and  luminal  epithelial 
cells  using  microdissection  approaches.  A  total  of  2,500  ESTs  have  been  produced 
from these libraries and entered into the Prostate Expression Database
(www.pedb.org).

• Task 11: pick random cDNA clones from the new libraries, array on nylon
membranes and screen for abundant prostate cDNAs, select non-abundant species, PCR
amplify inserts. Random sequencing of cDNA clones from the libraries described
above and additional libraries from prostate cancer cell lines has identified >18,000
distinct genes expressed in prostate tissues. We have used a virtual selection
approach, rather than the physical negative selection approach, to identify
non-redundant clone sets representing the prostate transcriptome. These clones have been
extracted from the database archive, re-arrayed into 384-well microtiter plates,
amplified by PCR, and spotted onto microscope slides for subsequent hybridization.

• Task 12: construct physical micro-arrays of cDNA clones on glass supports using
robotic tools: total of 500 replicates. See Task 9 above. We are currently using a
GeneMachines robotic spotting tool with the capability of spotting >18,000 cDNAs
per microscope slide. More than 500 replicate slides have been printed to date
comprising the 6,000 prostate PEDB cDNAs. These slides are currently used for the
analysis of amplification procedures in order to assess the fidelity of probe material
obtained from small numbers of cells.


Figure  2.  Microarray  hybridization  with  amplified  cDNA.  A  portion  of  the  6,000- 
clone  PEDB  cDNA  microarray  hybridized  with  amplified  cDNA  from  the  LNCaP 
prostate  cancer  cell  line.  A  reference  standard  of  pooled  RNA  from  3  different  prostate 
cancer cell lines serves as the control. Red spots indicate up-regulated genes and green
spots  indicate  down-regulated  genes  in  the  experiment  relative  to  control. 


• Task 13: assess alternative array methodologies as they become available (ink jet
oligonucleotide). This task will be on-going for the duration of the proposal.
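
The red/green convention described in the Figure 2 caption corresponds to the ratio of
experiment signal to reference signal at each spot. A minimal sketch of that classification
follows. The function name, the two-fold threshold, and the example intensities are
illustrative assumptions and are not values taken from this report; intensities are assumed
to be background-corrected and positive.

```python
import math


def classify_spot(experiment, reference, fold_threshold=2.0):
    """Return (log2 ratio, call) for one two-colour microarray spot.

    `experiment` and `reference` are background-corrected intensities.
    The two-fold default threshold is an assumed, illustrative cutoff.
    """
    if experiment <= 0 or reference <= 0:
        raise ValueError("intensities must be positive")
    log_ratio = math.log2(experiment / reference)
    cutoff = math.log2(fold_threshold)
    if log_ratio >= cutoff:
        call = "up (red)"
    elif log_ratio <= -cutoff:
        call = "down (green)"
    else:
        call = "unchanged"
    return log_ratio, call


if __name__ == "__main__":
    # Hypothetical intensities, chosen only to exercise each branch.
    for exp_signal, ref_signal in [(4000, 1000), (500, 2000), (1000, 1000)]:
        print(exp_signal, ref_signal, classify_spot(exp_signal, ref_signal))
```

Working on the log2 scale makes a two-fold increase and a two-fold decrease symmetric
around zero, which is why the same cutoff is applied in both directions.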

Technical objective 3: To construct representative probes from single or small numbers of
defined cells that are suitable for micro-array interrogation, and retain the transcriptome
composition (diversity and abundance) present in the original cell type(s).

• Task 14: convert to cDNA, amplify by PCR, and label nucleic acid from flow sorted cell
populations of decreasing cell quantities. Assess quality by Northern analysis and
hybridization to small “known clone” array. Compare with unamplified “traditional” probe (months
12-13). We are not currently using the flow-cytometry isolation approach due to changes in
gene expression associated with tissue disruption. Our focus now centers on microdissection.

• Task 15: as above with microdissected populations (months 13-14). We have successfully
microdissected prostate luminal and basal cells from 10 µm frozen sections. Amplification
using the PCR-based strategy incorporating an anchored primer has been successful in
producing adequate amounts of cDNA for probe construction and hybridization. However, the
fidelity of the amplification in our first attempts has been poor, which skews
message abundance levels in the probe material relative to the starting material. Our next
approach will be to truncate the cDNAs roughly to a common size using a frequent-cutting
restriction enzyme, followed by adapter ligation and subsequent amplification. This approach is
designed to eliminate length bias that may be skewing message ratios.

•  Task  16:  as  above  with  aRNA  method  and  flow  sorted  cells  (months  15-16).  As  described 
above,  we  are  not  currently  using  flow-sorting  for  cell  isolation  and  probe  construction. 

• Task 17: as above with microdissected populations (months 17-18). We have used a
modification of the aRNA protocol developed by Eberwine et al. We have achieved a ~100-fold
amplification with a first round aRNA synthesis and an additional ~100-fold amplification
with a second round. This allows for the use of ~0.5 ng of total RNA for probe construction.
However, the aRNA amplification is still not suitable for the analysis of single, or small
numbers of, microdissected cells. A second experiment using a shorter RNA
polymerase-mediated synthesis duration looks promising. These data are currently undergoing analysis.

•  Task  18:  as  above  with  microdissected  populations  from  frozen  and  fixed  tissues,  (months 
19-20).  Work  in  progress. 

•  Task  19:  convert  to  cDNA,  amplify,  label,  and  hybridize  single-cell  probes  to  high-density 
oligonucleotide  arrays,  (months  21-25).  Work  pending. 

• Task 20: capture and quantitate hybridization spot intensities on fluorimage laser scanners,
and enter into database (months 21-25). We have contacted investigators at Stanford
University and acquired software for the incorporation of cDNA array data into PEDB. The
software (database architecture) is currently being incorporated into the structure of PEDB and
should facilitate the storage and analysis of microarray data.
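
The amplification figures reported under Task 17 can be combined into a rough yield
estimate. The sketch below is a back-of-the-envelope calculation only: the ~100-fold yield
per round and the ~0.5 ng input come from the text, while the assumption that only the
mRNA fraction is amplified, the 2% mRNA fraction itself, and the function names are
illustrative and not taken from this report.

```python
def cumulative_fold(fold_per_round, rounds):
    """Overall amplification after `rounds` successive rounds of equal yield."""
    return fold_per_round ** rounds


def expected_arna_ng(total_rna_ng, mrna_fraction, fold_per_round, rounds):
    """Rough aRNA mass in ng, assuming only the mRNA fraction is amplified."""
    return total_rna_ng * mrna_fraction * cumulative_fold(fold_per_round, rounds)


if __name__ == "__main__":
    # ~100-fold per round and 0.5 ng input are from the text;
    # the 2% mRNA fraction is an assumed, illustrative value.
    print(cumulative_fold(100, 2))
    print(round(expected_arna_ng(0.5, 0.02, 100, 2), 1))
```

Two rounds at roughly 100-fold each give roughly 10,000-fold overall, so the final mass
depends strongly on how much amplifiable mRNA is present in the starting material.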


Technical objective 4: To identify a cohort of cellular transcripts which correlate with, define,
or “fingerprint”, a cellular phenotype(s).

The  following  work  is  in  progress  or  pending  the  completion  of  Technical  objective  3. 

• Task 21: examine hybridization intensities (values) for each datapoint in an automated,
comparative fashion from cells of a priori defined identical phenotype (luminal epithelium with
luminal epithelium) to develop cohorts of phenotype-defining transcripts (months 26-27)

• Task 22: examine hybridization intensities between cells with a priori defined different
phenotypes to establish a lineage relationship (months 26-27)

•  Task  23:  correlate  expression  profiles  with  known  molecular/biochemical/functional  data 
concerning  each  cell  type  (metastatic  location),  (months  27-28) 

•  Task  24:  analyze  by  DNA  sequencing  cDNAs  which  are  in  phenotype  cohorts  and  have  not 
previously  been  defined,  (months  26-28) 

•  Task  25:  analyze  expression  data  using  cluster  and  phylogeny  algorithms  to  assess  lineage 
relationships,  (months  26-29) 

• Task 26: plan molecular experiments and clinical evaluation of candidate phenotype-defining
cohorts: e.g., 1) retrospective analysis of carcinomas with known clinical outcomes
(progression/metastasis), 2) prospective analysis of diagnostic needle biopsy samples, 3) evaluation of
unrecognized or “latent” cancer samples obtained at autopsy (months 27-30)

•  Task  26:  analyze/compile  data  and  prepare  formal  report  (month  30) 


KEY  RESEARCH  ACCOMPLISHMENTS  (since  the  previous  annual  report) 

• Obtained and purified single circulating neoplastic prostate cells from the peripheral blood of
patients with prostate cancer and analyzed exon 8 of the androgen receptor for molecular
alterations.

•  Constructed  cDNA  libraries  from  laser-capture  microdissected  prostate  luminal  and  basal 
epithelial  cells  and  prostate  stroma. 

•  Sequenced  and  analyzed  2,000  cDNAs  (producing  ESTs)  from  the  luminal  cell,  basal  cell, 
and  stromal  libraries. 

•  Constructed  cDNA  microarrays  comprised  of  6,000  different  prostate  cDNAs. 

• Constructed complex cDNA probes from microdissected cells and from 0.5 ng total RNA and used
the probes in microarray hybridization experiments. The amplification does not yet preserve
gene expression ratios with the fidelity required for experimental comparisons.

•  Acquired  database  software  for  archiving  and  analyzing  cDNA  microarray  experiments. 

CONCLUSIONS

The  research  accomplished  to  date  has  demonstrated  the  ability  to  reproducibly  isolate  defined 
prostate  cell  populations  by  microdissection  and  flow  cytometry.  Gene  expression  studies  of  the 
cells  purified  by  flow-cytometry  reveal  an  altered  expression  profile  that  we  believe  results  from 
the  tissue  dissociation/dispersion  procedures.  Ongoing  and  future  work  employs  microdissection 
as the procedure of choice for specific cell-type analyses. The microdissection approach using a
laser  capture  microscope  is  an  efficient  procedure  for  isolating  cells  representing  abundant  cell 
types,  and  we  have  isolated,  purified,  and  analyzed  the  gene  expression  profiles  from  luminal 
epithelium  and  stromal  elements.  We  have  greatly  expanded  the  database  of  sequences  acquired 
from  specific  prostate  cell  types,  and  constructed  arrays  encompassing  a  wide  range  of  diverse 
genes  (n=6,000).  In  preliminary  experiments  we  have  used  amplified  cDNA  probes  isolated  from 
small  cell  numbers  to  assess  the  gene  expression  profiles  of  defined  cell  types. 

Comprehensive analyses of prostate gene
expression: Convergence of expressed sequence
tag databases, transcript profiling and proteomics

Several methods have been developed for the comprehensive analysis of gene
expression in complex biological systems. Generally these procedures assess either a
portion of the cellular transcriptome or a portion of the cellular proteome. Each
approach has distinct conceptual and methodological advantages and disadvantages.
We have investigated the application of both methods to characterize the gene
expression pathway mediated by androgens and the androgen receptor in prostate cancer
cells. This pathway is of critical importance for the development and progression of
prostate cancer. Of clinical importance, modulation of androgens remains the mainstay
of treatment for patients with advanced disease. To facilitate global gene expression
studies we have first sought to define the prostate transcriptome by assembling and
annotating prostate-derived expressed sequence tags (ESTs). A total of 55 000
prostate ESTs were assembled into a set of 15 953 clusters putatively representing 15 953
distinct transcripts. These clusters were used to construct cDNA microarrays suitable
for examining the androgen-response pathway at the level of transcription. The
expression of 20 genes was found to be induced by androgens. This cohort included known
androgen-regulated genes such as prostate-specific antigen (PSA) and several novel
complementary DNAs (cDNAs). Protein expression profiles of androgen-stimulated
prostate cancer cells were generated by two-dimensional electrophoresis (2-DE).
Mass spectrometric analysis of androgen-regulated proteins in these cells identified
the metastasis-suppressor gene NDKA/nm23, a finding that may explain a marked
reduction in metastatic potential when these cells express a functional androgen
receptor pathway.

Keywords: Prostate / Transcriptome / Proteome / Androgen / Microarray


1  Introduction 

The development and subsequent progression of human
prostate carcinoma is propelled by the accumulation of
genetic alterations and influenced by environmental
factors. One pivotal mediator of prostate carcinogenesis is
the androgen receptor (AR) pathway. The majority of
prostate cancers initially require androgens for growth,
and the elimination of AR-ligands by surgical or chemical
castration leads to marked tumor regression through a
mechanism of programmed cell death [1]. The
manipulation of the AR pathway has been used in clinical medicine
since the 1940s as the primary treatment of advanced
prostate cancer. However, this therapy is palliative and
eliminates the potential beneficial effects of
androgen-induced cellular differentiation. Surviving cancer cells lose
their dependence on androgens over time and are
capable of proliferation in the absence of serum androgens. The
molecular events leading to androgen independence (AI)
have not been defined, but potential mechanisms include
overexpression of the AR, mutations in the AR gene
leading to promiscuous ligand binding, and the activation of
the AR or downstream regulatory molecules by other
endocrine or paracrine growth factors [2, 3].

Until recently, biological investigations have almost
entirely focused on the study of individual genes and
proteins. This has partly been due to the submicroscopic
nature and transient existence of relevant molecules,
combined with a lack of quantitative technology capable
of providing accurate comprehensive views of biological
complexity. Significant advances have been achieved by
studying individual genes, proteins and small numbers of
molecular interactions. However, conclusions made on
the basis of the study of an individual gene may have
limited relevance as to how the gene and gene product
function in the context of the whole cell, tissue, or organism.

Progress in understanding complex molecular processes
has been hampered by the lack of a complete inventory
or “tool-set” of all genes and their cognate proteins that
are necessary for defining phenotypes of normal and
pathological cellular states.

P. S. Nelson et al., Electrophoresis 2000, 21, 1823-1831. © WILEY-VCH Verlag GmbH, 69451 Weinheim, 2000

The completion of the Human Genome Project will
provide a foundation for a thorough description of this
molecular inventory. More specifically, the gene inventory or
tool set required for studies of prostate carcinogenesis is
that portion of the human genome used or expressed in
the human prostate gland. The subset of genes
transcribed or expressed in a given cell or tissue type such as
the prostate may be defined as the “transcriptome”, the
dynamic link between the genome, the proteome, and the
cellular phenotype associated with physical
characteristics [4]. Once a transcriptome has been described, the
next objective is to understand the relationships of the
genes and their protein products in terms of a complex
system, e.g., biological pathways and networks, that may
define health and disease. With this aim, novel
technologies for comprehensively assessing genomes and
patterns of gene expression have recently been developed.

Our initial efforts have focused on defining the prostate
transcriptome through the production and assembly of
expressed sequence tags (ESTs) derived from prostate
complementary DNA (cDNA) libraries representing a wide
spectrum of normal and neoplastic states. These EST
assemblies have been used to construct cDNA
microarrays suitable for interrogating the transcriptome in
experiments designed to examine specific biological pathways
that may be involved in prostate carcinogenesis. The
molecular pathway mediating androgenic hormone action on
prostate cells is a specific focus of our work. The
functional architecture of prostate gene networks is further
elucidated by our next level of analysis, which incorporates
studies of the prostate proteome. Analysis of the
transcriptome facilitates proteome studies by providing a
comprehensive prostate sequence database for identifying and
annotating known and unknown proteins displayed by
two-dimensional gel electrophoresis (2-DE) and analyzed
by mass spectrometry (MS). Our objectives for
delineating the molecular network(s) influenced by AR activation
are to identify specific targets that promote the
differentiation and apoptotic potential of prostate cancer cells while
reducing their ability to proliferate.

2  Materials  and  methods 

2.1  Assembly  of  a  prostate  transcriptome: 
Prostate  Expression  Database  (PEDB) 

A  prostate  transcriptome  was  assembled  from  ESTs  de¬ 
rived  from  cDNA  libraries  representing  a  wide  sprectrum 


of  normal,  benign,  and  malignant  prostate  tissues.  A 
detailed  description  of  the  assembly  and  annotation  pro¬ 
cedure  is  described  elsewhere  [5].  Briefly,  individual 
ESTs,  detailed  cDNA  library  information,  and  sequence 
annotations  were  loaded  into  a  relational  database  (Ora¬ 
cle  Corp.)  termed  the  Prostate  Expression  Database 
(PEDB).  Prostate  ESTs  used  for  the  assembly  were  gen¬ 
erated  in  our  laboratory  as  previously  described  [6].  Addi¬ 
tional  public  domain  ESTs  of  prostate  origin  were  ob¬ 
tained  from  Genbank  (http://www.ncbi.nlm.nih.gov/Entrez/ 
batch.html),  the  NCI  Cancer  Genome  Anatomy  Project 
(CGAP)  [7],  and  The  Institute  for  Genome  Research 
(TIGR)  (http://www.tigr.org).  Each  EST  was  examined  for 
sequence  homology  to  cloning  vectors,  Escherichia  coli, 
and  repetitive  DNA  sequences  using  a  core  program 
called  AnalDemon  (http://www.mbt.washington.edu/PE 
DB/software).  AnalDemon  first  employs  Cross_Match 
(http://bozeman.mbt.washington.edu/phrap.docs/general.html); a program based on the
Smith-Waterman-Gotoh algorithm, to screen for vector and E. coli genome contamination.
ESTs are then examined for interspersed repeats and regions of low sequence complexity
using RepeatMasker (http://ftp.genome.washington.edu/RM/RepeatMasker.html). Specific
portions of EST sequences exhibiting homology to any of these unwanted elements are masked
in order to eliminate the sequence from contributing to an assembly process. CAP2 [8], a
multiple alignment program based on a variant of the Smith and Waterman algorithm, was
used for assembling ESTs into homologous groups or clusters. Clustering is based on
maximal scoring of overlapping alignments and allows for general substitutions resulting
from sequencing errors, insertions, and deletions. CAP2 produces a consensus sequence and
allows varying sensitivity and overlap parameters. Each group or cluster of ESTs
exhibiting significant homology with one another is termed a species. Thus, a species is
a sequence or group of sequences that is unique relative to the nucleotide sequence of
other groups of sequences, and each is given a unique PEDB Species Identification number
(SID). The SID provides a means to analyze gene expression across the entire set of
assemblies, and can be used to provide a library-by-library species-specific differential
expression profile. Each distinct species from the clustering process was annotated by
searching the Unigene (ncbi.nlm.nih.gov in /pub/schuler/unigene), Genbank
(ncbi.nlm.nih.gov/blast/db/nt.Z), and EST databases (ncbi.nlm.nih.gov/blast/db/est.Z)
using BLASTN (http://blast.wustl.edu). Annotations were assigned automatically using the
program Smart-Blast (http://www.mbt.washington.edu/PEDB/software) by selecting the
database match with the lowest p value and the highest blast score where the maximum
p value is e-20 and the minimum blast score is 500.
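
The annotation rule described above (keep only matches with a p value of at most e-20 and
a score of at least 500, then take the match with the lowest p value and highest score)
can be sketched as follows. This is a minimal illustration and not the Smart-Blast
program itself: the `Hit` record, the function name, and the choice to break p-value ties
by score are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

MAX_P_VALUE = 1e-20  # from the text: maximum acceptable p value
MIN_SCORE = 500      # from the text: minimum acceptable blast score


@dataclass(frozen=True)
class Hit:
    """One database match for a cluster consensus (hypothetical record layout)."""
    database: str
    accession: str
    p_value: float
    score: int


def best_annotation(hits: Iterable[Hit]) -> Optional[Hit]:
    """Return the passing hit with the lowest p value; ties go to the higher score.

    Returns None when no hit meets both thresholds (the species stays unannotated).
    The tie-breaking order is an assumption; the text only names both criteria.
    """
    passing = [h for h in hits if h.p_value <= MAX_P_VALUE and h.score >= MIN_SCORE]
    if not passing:
        return None
    return min(passing, key=lambda h: (h.p_value, -h.score))
```

A species whose hits all fail either threshold returns `None`, which corresponds to the
unannotated category.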

2.2 Prostate transcriptome analyses by cDNA microarray

2.2.1 Microarray fabrication

A nonredundant set of 1500 prostate-derived cDNA clones was identified from the prostate
transcriptome archived in PEDB. Individual clone inserts were amplified by PCR using
2 µL of bacterial transformant culture as template with primers BL_m13F
(5'-GTAAAACGACGGCCAGTGAATTG-3') and BL_m13R (5'-ACACAGGAAACAGCTATGACCATG-3') as
previously described [6]. PCR products were purified through Sephacryl S500 (Amersham
Pharmacia Biotech, Uppsala, Sweden), mixed 1:1 with dimethylsulfoxide, and spotted in
duplicate onto coated Type VII glass microscope slides (Amersham Pharmacia Biotech) using
a Molecular Dynamics (Sunnyvale, CA, USA) Genii robotic spotting tool. After spotting,
the glass slides were air-dried and UV-cross-linked with 500 mJ of energy and then baked
at 95°C for 30 min.

2.2.2 Probe construction and microarray hybridization

Total RNA was isolated from the androgen-responsive LNCaP prostate cancer cells [9] at
time points of 0, 4, 8, 24, and 72 h after androgen depletion or supplementation using
TRIzol (Life Technologies, Paisley, UK) according to the manufacturer's directions.
Fluorescence-labeled probes were made from 30 µg of total RNA in a reaction volume of
20 µL containing 1 µL anchored oligo-dT primer (Amersham Pharmacia Biotech), 0.05 mM
Cy3-dCTP (Amersham Pharmacia Biotech), 0.05 mM dCTP, 0.1 mM each dGTP, dATP, dTTP, and
200 U Superscript II reverse transcriptase (Life Technologies). Reactants were incubated
at 42°C for 120 min followed by heating to 94°C for 3 min. Unlabeled RNA was hydrolyzed
by the addition of 1 µL of 5 N NaOH and heating to 37°C for 10 min. One µL of 5 M HCl
and 5 µL of 1 M Tris-HCl, pH 7.5, were added to neutralize the base. Unincorporated
nucleotides and salts were removed by chromatography (Qiagen, Chatsworth, CA, USA), and
the cDNA was eluted in 30 µL dH2O. One µg of dA/dT 12-18 (Amersham Pharmacia Biotech)
and 1 µg of human Cot1 DNA (Life Technologies) were added to the probe, heat-denatured
at 94°C for 5 min, combined with an equal volume of 2× microarray hybridization solution
(Amersham Pharmacia Biotech) and prehybridized at 50°C for 1 h. The mixture was then
placed onto a microarray slide with a coverslip and hybridized in a humid chamber at
52°C for 16 h. The slides were washed once with 1× sodium chloride and sodium citrate
(SSC), 0.2% SDS at room temperature for 5 min and then twice with 0.1× SSC, 0.2% SDS at
room temperature for 10 min. After washing, the slide was rinsed in distilled water to
remove trace salts and dried.


2.2.3  Image  acquisition  and  data  analyses 

Fluorescence intensities of the immobilized targets were measured using a laser confocal
microscope (Molecular Dynamics). Intensity data were integrated at a pixel resolution of
10 µm using approximately 20 pixels per spot, and recorded at 16 bits. Quantitative data
were obtained with the SpotFinder Version 2.4 program written at the University of
Washington. Local background hybridization signals were subtracted prior to comparing
spot intensities and determining expression ratios. For each experiment, each cDNA was
represented twice on each slide, and the experiments were performed in duplicate,
producing four data points per cDNA clone and hybridization probe. Intensity ratios for
each cDNA clone hybridized with probes derived from androgen-stimulated LNCaP and
androgen-starved LNCaP cells were calculated (stimulated intensity/starved intensity).
Gene expression levels were considered significantly different between the two conditions
if all four replicate spots for a given cDNA demonstrated a ratio > 2 or < 0.5, and the
signal intensity was greater than two standard deviations above the image background. We
have previously determined that expression ratios less than 1.5 are not reproducible in
our system (data not shown).
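
The calling rule in the preceding paragraph can be sketched as a small function. It is an
illustration under stated assumptions, not the SpotFinder program: inputs are taken to be
background-subtracted intensities, the intensity filter is applied to the brighter of the
two channels, and spots with a non-positive intensity are treated as uncallable. The
function and parameter names are hypothetical.

```python
from typing import Sequence


def call_regulation(stimulated: Sequence[float],
                    starved: Sequence[float],
                    background_sd: float,
                    fold: float = 2.0,
                    n_sd: float = 2.0) -> str:
    """Classify one cDNA as 'up', 'down', or 'unchanged' from its replicate spots.

    stimulated / starved: background-subtracted intensities, one pair per replicate
    spot (four per cDNA in the text). A call is made only if every replicate ratio
    (stimulated/starved) is > fold or every ratio is < 1/fold, and every spot is
    brighter than n_sd standard deviations of the image background.
    """
    if len(stimulated) != len(starved) or len(stimulated) == 0:
        raise ValueError("need the same non-zero number of replicate spots per condition")
    floor = n_sd * background_sd
    ratios = []
    for s, v in zip(stimulated, starved):
        # Assumption: a spot is usable if its brighter channel clears the floor
        # and both channels are positive, so the ratio is defined.
        if max(s, v) <= floor or s <= 0 or v <= 0:
            return "unchanged"
        ratios.append(s / v)
    if all(r > fold for r in ratios):
        return "up"
    if all(r < 1.0 / fold for r in ratios):
        return "down"
    return "unchanged"
```

Requiring agreement across all replicates is stricter than averaging the ratios: a single
discordant spot is enough to withhold a call.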

2.3 Prostate proteome analyses by 2-DE and MS

2.3.1 2-DE

LNCaP prostate cancer cells were grown under conditions of androgen stimulation or
androgen starvation as described above. M12AR cells, a highly metastatic prostate cancer
cell line derived from the serial passaging of SV40 immortalized prostate epithelial
cells [10] and transfected with the AR, were grown in serum-free DMEM high-glucose media
(Life Technologies) supplemented with insulin, transferrin, selenium, and dexamethasone
as previously described [11]. Cells were allowed to reach 80% confluency and then treated
for 24 h with the same media supplemented with 10 nM R1881. Cells were washed once with
PBS, scraped from plates with a rubber policeman and pelleted by centrifugation. Protein
was harvested as described by Garrels and Franza [12]. Briefly, cell pellets were lysed
in a buffer containing 0.3% SDS, 1% β-mercaptoethanol, and 50 mM Tris-HCl, pH 8.0,
100 µg/mL DNAase I, 50 µg/mL RNAase A, 5 mM MgCl2, and heated for 1 min at 100°C.
Harvested protein was lyophilized, resuspended in isoelectric focusing (IEF) gel
rehydration solution, and stored at -80°C. Soluble proteins were run in the first
dimension by using a commercial flatbed electrophoresis system (Multiphor II; Amersham
Pharmacia Biotech). Nonlinear immobilized pH gradient (IPG) dry strips ranging from
pH 3.0 to 10.0 (Amersham Pharmacia Biotech) were used for the first-dimensional
separation. Forty micrograms of protein from whole-cell lysates were mixed with IPG strip
rehydration buffer (8 M urea, 2% Nonidet P-40, 10 mM dithiothreitol), and 250-380 µL of
solution (13 and 18 cm IPGs, respectively) was added to individual lanes of an IPG strip
rehydration tray (Amersham Pharmacia Biotech). The strips were rehydrated at room
temperature for 1 h. The samples were run at 300 V, 10 mA, 5 W for 2 h, ramped to
3500 V, 10 mA, 5 W over a period of 3 h, and then kept at 3500 V, 10 mA, 5 W for
15-19 h. Following IEF (60-70 kVh), the IPG strips were first reequilibrated for 8 min
in a solution of 2% w/v dithiothreitol, 2% w/v SDS, 6 M urea, 30% w/v glycerol, 0.05 M
Tris-HCl (pH 6.8) and subsequently for 4 min in a solution of 2.5% w/v iodoacetamide,
2% w/v SDS, 6 M urea, 30% w/v glycerol, 0.05 M Tris-HCl (pH 6.8) with a trace of
bromophenol blue added for color. Following reequilibration, the strips were transferred
and apposed to 10% polyacrylamide second-dimensional gels. Polyacrylamide gels were
poured in a casting stand with 10% acrylamide, 2.67% piperazine diacrylamide, 0.375 M
Tris, pH 8.8, 0.1% w/v SDS, 0.05% w/v ammonium persulfate, 0.05% TEMED
(N,N,N',N'-tetramethylethylenediamine) in Milli-Q water (Millipore, Bedford, MA, USA).
Second-dimensional gels (0.1 × 20 × 20 cm) were run in an apparatus supplied by Oxford
Glycosciences (Abingdon, UK). Once the IPG strips were apposed to the second-dimensional
gels, they were immediately run at a constant current of 50 mA at 500 V and 85 W for
20 min, followed by a constant current of 200 mA at 500 V and 85 W until the buffer
front was 10-15 mm from the bottom of the gel. Gels were removed and silver stained
according to the procedure of Blum et al. [13].
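
As a rough check on the focusing program in Section 2.3.1, the accumulated volt-hours can
be estimated from the three voltage steps. The sketch below assumes that the set voltage
is actually reached at each step (that is, the run is not held back by the 10 mA or 5 W
limits) and that the ramp from 300 V to 3500 V is linear; neither assumption is stated in
the text, and the function names are hypothetical.

```python
from typing import List, Tuple

Step = Tuple[float, float, float]  # (start voltage in V, end voltage in V, duration in h)


def volt_hours(steps: List[Step]) -> float:
    """Integrate voltage over time, treating each step as a linear segment."""
    return sum((v_start + v_end) / 2.0 * hours for v_start, v_end, hours in steps)


def ief_program(hold_hours: float) -> List[Step]:
    """The three-step program from the text, with a variable final hold (15-19 h)."""
    return [
        (300.0, 300.0, 2.0),          # 300 V for 2 h
        (300.0, 3500.0, 3.0),         # ramp to 3500 V over 3 h (assumed linear)
        (3500.0, 3500.0, hold_hours)  # hold at 3500 V
    ]


if __name__ == "__main__":
    for hold in (15, 19):
        print(f"{hold} h hold: {volt_hours(ief_program(hold)) / 1000:.1f} kVh")
```

Under these assumptions the 15 h and 19 h holds give about 58.8 and 72.8 kVh, which
brackets the 60-70 kVh quoted in the text.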

2.3.2 Protein identification by tandem mass spectrometry

Protein spots from gels were identified by tandem mass spectrometry (MS/MS) as previously
described [14]. Spots from silver-stained gels were excised and in-gel tryptic peptides
were separated by microcapillary LC (µLC) coupled to a tandem mass spectrometer
(TSQ 7000; Finnigan, San Jose, CA). Peptide fragmentation spectra were generated in a
data-dependent fashion. Spectra were searched against the composite OWL protein sequence
database by using the computer program SEQUEST [15] and against the PEDB. A protein match
was determined by comparing the number of peptides identified and their respective
cross-correlation scores. Protein identifications were verified by comparison with
theoretical molecular weights and isoelectric points.
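
Database searching of MS/MS spectra relies on knowing which peptides a tryptic digest of
each candidate protein would produce. A minimal in-silico digest is sketched below using
the commonly cited trypsin rule (cleave on the C-terminal side of K or R, except when the
next residue is P). It only illustrates the idea and is not the procedure used by
SEQUEST; missed cleavages and modifications are ignored, and the test sequence is
invented.

```python
from typing import List


def tryptic_peptides(protein: str) -> List[str]:
    """Split a protein sequence into fully cleaved tryptic peptides.

    Rule applied: cleave after K or R unless the following residue is P.
    """
    seq = protein.upper()
    peptides: List[str] = []
    start = 0
    for i, residue in enumerate(seq):
        next_residue = seq[i + 1] if i + 1 < len(seq) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])  # C-terminal peptide without a trailing K/R
    return peptides


if __name__ == "__main__":
    # Invented toy sequence, not a real protein.
    print(tryptic_peptides("MKRPAAKGGR"))
```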

3  Results  and  discussion 

3.1 Prostate gene expression analyses: EST assemblies and annotation

ESTs produced from cDNA libraries derived from normal and neoplastic human prostate
tissue samples were entered into the PEDB, an Oracle relational database running on a Sun
SPARC workstation. The most recent PEDB build was assembled starting with 55 000 prostate
ESTs produced from 42 cDNA libraries. Portions of EST sequences with homology to cloning
vector, E. coli genomic DNA, and human repetitive DNA sequences were masked and ESTs with
> 100 bp of high quality sequence were admitted to the assembly process (Fig. 1A). A
total of 49 816 high quality ESTs were assembled using the sequence assembly program CAP2
to produce 21 114 clusters. Each cluster was annotated by searching the Unigene, Genbank,
and dbEST databases with the CAP2-generated cluster consensus sequences using BLASTN.
Clusters annotating to the same database sequence were joined to further reduce the
number of distinct clusters to 15 953 (Fig. 1B).

Figure 1. Assembly of a prostate transcriptome. (A) 55 000 prostate ESTs were examined
for “junk” sequences leaving 49 816 high quality ESTs suitable for assembly. Clustering
the ESTs into groups of high homology produced a set of 21 114 clusters that were
annotated against nucleotide and protein sequences in the public sequence databases.
Clusters exhibiting homology to Genbank sequences were also examined for homology to
Unigene sequences (UG) to further collapse clusters into homologous groups. (B) Following
clustering, database annotations and reclustering, a total of 15 953 distinct prostate
EST species were identified. More than 2000 prostate species did not have homology to
nonprostate-derived sequences in the public databases (unannotated).
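
The final step of the assembly described in Section 3.1, joining clusters that annotate
to the same database sequence, amounts to grouping clusters by their best annotation. The
sketch below assumes each cluster has at most one assigned annotation and that
unannotated clusters each remain a separate species; the identifiers and the function
name are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def join_clusters_by_annotation(annotations: Dict[str, Optional[str]]) -> List[List[str]]:
    """Group cluster IDs that share a database annotation into one species.

    annotations maps a cluster ID to the accession of its assigned database match,
    or to None when the cluster is unannotated. Unannotated clusters are kept as
    single-member species (an assumption of this sketch).
    """
    by_accession: Dict[str, List[str]] = defaultdict(list)
    species: List[List[str]] = []
    for cluster_id in sorted(annotations):
        accession = annotations[cluster_id]
        if accession is None:
            species.append([cluster_id])
        else:
            by_accession[accession].append(cluster_id)
    species.extend(by_accession[acc] for acc in sorted(by_accession))
    return species


if __name__ == "__main__":
    example = {"c1": "Hs.1", "c2": "Hs.1", "c3": None, "c4": "X2", "c5": None}
    print(len(join_clusters_by_annotation(example)))  # five clusters collapse to four species
```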

Studies in the 1970s using reassociation kinetics to estimate the number of different
transcripts indicate that between 10 000 and 30 000 distinct mRNAs are present in
mammalian cells or organs [16, 17]. Recent data produced using the method of Serial
Analysis of Gene Expression (SAGE) support these estimates of transcript diversity in
mammalian epithelial cells with estimates of 14 000-20 000 different mRNAs per cell
[18]. Although the identification of alternatively spliced transcripts and highly
homologous gene family members may increase or decrease these estimates slightly, they
nevertheless provide a rough estimate of the complexity of cellular gene activity. Based
upon these data, the 15 953 prostate EST clusters that we have assembled should
characterize roughly 50-75% of the prostate transcriptome. It is likely that this
assembled dataset comprises all of the abundant and most of the moderately abundant
prostate transcripts [6]. Ongoing work involves the acquisition of the remaining low
abundance transcripts. Approaches to achieving this goal involve the construction of cDNA
libraries from highly selected purified cell populations such as luminal epithelial and
neuroendocrine cells, and from prostate tissues at different stages of development
(e.g., fetal prostate) or under different hormonal influences (e.g., androgen
stimulation). Another useful strategy involves the iterative removal of abundant and
previously identified cDNAs in order to select for rare species. A high-throughput method
using cDNA array-based technology has been developed to facilitate this process [19].
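
The 50-75% coverage figure above can be related to the cited transcript estimates with
simple arithmetic. The sketch below is only a worked illustration: the totals of 20 000
and 30 000 are example values picked from within the cited ranges, not figures the text
itself uses for this calculation.

```python
CLUSTERS = 15_953  # distinct prostate EST clusters reported in the text


def coverage(total_transcripts: int) -> float:
    """Fraction of the transcriptome covered if it contains total_transcripts mRNAs."""
    return CLUSTERS / total_transcripts


def implied_total(fraction: float) -> float:
    """Transcriptome size implied by a given coverage fraction."""
    return CLUSTERS / fraction


if __name__ == "__main__":
    for total in (20_000, 30_000):
        print(f"{total} transcripts -> {coverage(total):.1%} covered")
    for fraction in (0.50, 0.75):
        print(f"{fraction:.0%} coverage -> about {implied_total(fraction):,.0f} transcripts")
```

Read the other way round, 50-75% coverage corresponds to a transcriptome of roughly
21 000 to 32 000 distinct transcripts, which sits at the upper end of the cited
estimates.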

Figure 2. cDNA microarray analysis of prostate androgen-regulated gene expression. A
nonredundant clone set comprised of 1536 cDNAs was hybridized with Cy3-labeled (red) cDNA
from androgen-stimulated LNCaP cells and Cy5-labeled (green) cDNA from androgen-starved
LNCaP cells. The expression ratio for each cDNA was determined and the ratios for all
cDNAs with signal intensities 2.33-fold above the standard deviation of the background
signal were clustered according to transcript levels over time. The Cluster and TreeView
software programs available at the Stanford genome web site were used for the analysis
(http://rana.Stanford.EDU/software/). Twenty genes were identified with increased
expression after androgen stimulation.

3.2 Prostate gene expression analyses: cDNA microarray

Microarrays comprised of 1500 distinct prostate-derived cDNAs were hybridized with
fluorescently labeled total cDNA probes produced from androgen-stimulated and
androgen-starved LNCaP prostate cancer cells. No cDNAs were identified whose expression
level decreased with androgen stimulation. In contrast, the hybridization ratios of 20
different cDNAs were consistently increased by > 2-fold in androgen-stimulated relative
to androgen-starved cells (Fig. 2). This group included cDNAs encoding the human
glandular kallikrein 2 (hK2) and human glandular kallikrein 3 (hK3), also known as
prostate-specific antigen (PSA). The regulation of hK2 and PSA has previously been shown
to be mediated by androgens through a mechanism involving androgen-response element (ARE)
binding sites in the promoter regions of these genes [20, 21].

In addition to hK2 and PSA, we identified several other genes previously shown to be
androgen-regulated, including the prostate homeobox gene NKX3.1 [22], the serine protease
prostase/PRSS17 [23], and two genes involved in lipid metabolism. The microarray analysis
also indicated that the expression of the membrane-bound serine protease TMPRSS2 [24] was
regulated by androgen. We subsequently confirmed the androgen regulation of TMPRSS2 by
Northern analysis, identified a putative ARE in the TMPRSS2 promoter region, and
demonstrated that TMPRSS2 is highly expressed in the prostate gland relative to other
human tissues [25]. Several cDNAs corresponding to uncharacterized genes also exhibited
transcriptional regulation by androgen (Fig. 2). We have cloned the full-length cDNA and
confirmed the androgen regulation of one of these novel sequences and designated it as
PART-1, for Prostate Androgen-Regulated Transcript-1, as it lacks significant homology to
nucleotide or protein sequences in the nonredundant subdivision of the GenBank and
SWISS-Prot databases [26]. Interestingly, the tissue pattern of PART-1 expression is also
essentially restricted to the prostate. The cloning and characterization of the other
identified androgen-regulated cDNAs is in progress.

We anticipate that expanding these studies to include a greater portion of the prostate
transcriptome coupled with experiments designed to determine direct versus indirect
transcriptional regulation, and ultimately translational and post-translational
regulation of these genes, will establish a framework for understanding the cellular
functions mediated by androgens. Despite the important influence of androgenic hormones
on prostate cancer growth, relatively few downstream targets of the AR pathway have been
described. Studies designed to identify genes regulated by androgens in the rat prostate
determined that androgens increase the transcription of about 56 genes and decrease the
transcription of fewer than 10 genes [27]. From a therapeutic standpoint, it would be
extremely useful to distinguish and subsequently modulate the relevant molecules in the
AR program that mediate the divergent processes of cellular proliferation, cellular
differentiation, and apoptosis.

Figure 3. (Left) LNCaP 2-DE protein expression profile with androgen stimulation. (Right)
Three-step schema for protein identification using MS and computer sequence database
searching.

Figure 4. Identification of an androgen-regulated protein from metastatic prostate cancer
cells by 2-DE and MS. M12AR cells were (A) starved or (B) stimulated for 24 h with the
synthetic androgen R1881 and total cell lysates (40 µg each) were subjected to 2-DE.
Protein expression profiles were compared and proteins demonstrating qualitative
expression level differences were subjected to in-gel trypsin digestion, and identified
by µLC-MS/MS analysis. (C), (D) MS/MS spectrum of identified peptide, peptide sequence,
and identified ion series. (E) Results from correlation of acquired peptide fragmentation
spectra with database entries (using SEQUEST software). The MS/MS spectrum in (D) was
identified as NDKA_HUMAN (nm23) taken from the selected 2-D gel spot. Two additional
peptides were identified from this protein in a single run.

3.3 Prostate gene expression analyses: 2-DE and MS

To complement our prostate transcriptional data and provide a more complete picture of
prostate gene expression, we have undertaken a comprehensive analysis of that portion of
the prostate proteome regulated by androgenic hormones. Reference protein expression
profiles were produced for the LNCaP and M12AR prostate cancer cell lines using 2-DE
protein separation techniques under steady-state conditions (Fig. 3). Protein expression
profiles from cell lysates under conditions of androgen stimulation and androgen
starvation have also been generated. A comparison of 2-DE protein profiles under these
various conditions yielded a proteomic signature characterized by a subset of proteins
with qualitative and quantitative changes. Individual proteins were identified using a
sequential process of in-gel trypsin digestion and extraction, peptide separation by
µLC, generation of MS/MS spectra, and database correlation with the acquired peptide
fragmentation pattern (Fig. 3).

A comprehensive analysis of androgen-induced proteomic signatures is ongoing and our
initial experiments demonstrate the utility of this approach in identifying molecules of
potential importance in understanding androgen-mediated regulation of prostate cancer
progression and metastasis. Figure 4 depicts a portion of the 2-DE protein profile from
androgen-starved and androgen-stimulated M12AR prostate cancer cells with a
differentially expressed protein spot that is upregulated in M12AR cells after exposure
to androgens. This protein was identified as human nucleoside diphosphate kinase A
(NDKA/nm23), a well-characterized gene with tumor metastasis suppressor activity in
several different human tumors including melanoma, breast, ovary and prostate [28, 29].
Transfection of the DU-145 prostate cancer cell line with NDKA/nm23 inhibited the
adhesion to cell matrix and impaired colony growth in soft agar [29].

The M12 prostate cancer cell line is highly tumorigenic when implanted into nude mice and
metastasizes to different anatomical sites. Transfection of these cells with a functional
androgen receptor (M12AR) markedly decreases the proliferation rate, tumor growth,
invasiveness, and in vivo metastatic potential when these cells are injected into the
prostate glands of nude mice (S. Plymate, unpublished observation). NDKA/nm23 transcripts
have been shown to increase rapidly in prostate cancer cell lines after the
administration of androgens, though no functional ramifications of this increased
expression were described [30].

A possible mechanism for the decreased tumorigenic and metastatic capability of M12AR
cells compared with M12 cells lacking the AR involves the upregulation of NDKA/nm23 by
androgens through a functional androgen-response program restored by the AR transfection
and expression. Such an observation has direct clinical relevance. Both human and in
vitro studies suggest that there may be a survival benefit from maintaining an
androgen-responsive cohort of prostate tumor cells [31-33]. This concept has been studied
in the LNCaP cell system by comparing the rate of tumor growth in castrated mice
implanted with LNCaP cells with subsequent tumor growth (i) without further therapy, or
(ii) followed by intermittent androgen replacement. The rate of tumor growth as measured
by serum PSA was slower in animals treated with intermittent androgen supplementation
compared to those maintained in the castrated state [31].

4  Concluding  remarks 

The results presented here demonstrate the utility of global expression studies to
simultaneously identify multiple genes and gene products of biological relevance that
participate in specific metabolic pathways. Both known and unknown genes are rapidly
identified. Notable advantages of the microarray-based transcript profiling approach
include the ability to perform detailed time-course or variable drug-dose experiments in
a robust, economical fashion. Controlled replicate experiments can determine system and
procedural errors. However, this approach is absolutely dependent upon the identification
of diverse clone sets for array construction that are biologically relevant to the system
under study. In addition, a significant limitation of transcript profiling methods is the
lack of a tight correlation between gene activity as measured by mRNA level, and protein
abundance [34]. Global protein analyses focus on the actual biological effector
molecules, but are restricted by difficulties in detecting low abundance proteins,
accurately measuring the differences in protein levels between two samples, and a
dependency on comprehensive annotated sequence databases for protein identification.

Integrating the assembly and annotation of sequence databases with transcript profiling
and proteome analyses combines complementary robust approaches that capitalize on the
strengths and avoid the limitations of relying on one method. The further expansion of
this work to include the analysis of the entire prostate transcriptome coupled with
quantitative proteome studies should enable the characterization of gene networks and
cellular pathways that can be exploited for therapeutic intervention.

ABSTRACT

Identification, acquisition, and assessment of molecular markers that could be adopted as
surrogate endpoints for evaluating a response to prostate cancer intervention strategies
are highly desirable. Recent advances in the fields of genomics and biotechnology have
dramatically increased the quantity and accessibility of molecular information that is
relevant to the study of prostate carcinogenesis. One major advance involves the
construction of comprehensive databases that archive gene sequences and gene expression
data. This information is in a format suitable for virtual queries designed to
distinguish the molecular differences between normal and cancer cells. A second major
advance uses robotic tools to construct microarrays comprising thousands of distinct
genes expressed in prostate tissues. Such arrays offer a powerful approach for monitoring
the expression of thousands of genes simultaneously and provide access for techniques
designed to assess patterns or “fingerprints” of gene expression that may ultimately be
used as signatures of response to therapeutic intervention. UROLOGY 57 (Suppl 4A):
154-159, 2001. © 2001, Elsevier Science Inc.


The human genome is estimated to comprise approximately 30,000 to 100,000 genes. To
confer developmental and functional specificity, only a fraction of this total is active
in a given cell type at a given time, and these expressed genes essentially define the
state of that cell. The molecular profile of normal and cancer cells, ie, their set of
expressed genes, differs in both qualitative (alternative forms of a gene) and
quantitative fashions. Measurement of this profile may predict the phenotypic behavior of
such cells more accurately than traditional histologic approaches.

To identify informative biomarkers and suitable intermediate endpoints of disease, it
would be advantageous to have a catalog or index of all genes and their cognate proteins
that are expressed in normal and neoplastic prostate tissues. This resource could then be
rapidly exploited to identify candidate biomarkers for evaluation based on homology to
known genes of importance in prostate cancer, gene polymorphisms and mutations, or
alterations in gene expression. This review will focus particularly on the use of
tissue-specific expressed sequence tag (EST) databases, the development and use of cDNA
microarrays, and statistical issues related to microarray analyses. These approaches may
become essential for identifying new biomarkers in prostate cancer.

DATABASES AS TOOLS FOR BIOMARKER IDENTIFICATION

In 1997, the National Cancer Institute announced a bold new initiative, the Cancer Genome
Anatomy Project (CGAP), with the overall goal of achieving the comprehensive molecular
characterization of normal, precancerous, and cancerous cells.1-3 The CGAP is an
interdisciplinary program that uses National Institutes of Health intramural research
teams, academic centers, and commercial resources to establish an index of genes
expressed in tumors. The CGAP serves as an interface between genomics and cancer
research. The new technologies supported by this initiative, and the products resulting
from these technologies, will be accessible to the public through an Internet website
(http://www.cgap.nci.nih.gov). This Internet site provides information about cDNA
libraries of normal and cancerous tissue, description of the methods used in preparing
each library, and informatics tools to perform analyses of gene expression using cDNA
library data.

FIGURE 1. Gene expression profiles in normal, precancerous, and malignant prostate
tissue. Differences in gene expression between cDNA libraries prepared from various types
of prostate tissues can be analyzed by using the Digital Differential Display (DDD)
software program. Library A is prepared from normal prostate epithelium, Library B from
precancerous prostate tissue, Library C from malignant prostate cancer, and Library D is
from a control library prepared from a pool of brain, liver, and spleen tissue. The Gene
index contains the UniGene Cluster Identifier, and Gene Description lists the gene name.
In each box, the number at the top represents the fraction of sequences in that cDNA
library that expresses the gene or EST. The dot is a visual aid, which reflects the
numerical values. Each library is compared with each of the other libraries in pairwise
analysis. If the difference in gene expression between two libraries is statistically
significant, it is indicated by a greater than or less than symbol.

A goal of the CGAP is to facilitate the identification of possible molecular biomarkers
for various types of cancer. To enable investigators to analyze molecular databases that
are very large and complex, CGAP has developed software tools in collaboration with the
National Center for Biotechnology Information at the National Institutes of Health. These
software tools aid in the analysis and comparison of gene expression in a variety of
tissues and stages of cancer. All of these tools are available on the CGAP Internet
website.

An  example  of  a  software  analysis  tool  is  Digital 
Differential  Display  (DDD).4  DDD  is  used  to  com¬ 
pare  sequence-based  gene  expression  profiles 
among  individual  cDNA  libraries  or  pools  of  librar¬ 
ies  from  the  same  or  different  tissues.  Analysis  of 
different  gene  expression  profiles  may  identify 
genes  that  contribute  to  a  cell’s  unique  character¬ 
istics.  Such  genes,  when  expressed  at  different  lev¬ 
els  in  normal  and  cancer  cells,  may  be  considered 
as candidate biomarkers for use in cancer screening.
DDD uses a statistical comparison of genes
expressed  in  each  cDNA  library  to  determine 
which  differences  are  statistically  significant.  The 
statistical  analysis  is  based  on  the  Fisher  exact 
test.5  Differences  in  gene  expression  values  are  pre¬ 
sented  both  visually  and  numerically. 
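The kind of comparison that DDD performs can be illustrated with a minimal sketch of a two-sided Fisher exact test on EST counts from two libraries. This is not the DDD implementation: the function name `fisher_exact_two_sided` and the EST counts below are hypothetical and serve only to show the arithmetic.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    a: ESTs matching the gene in library 1    b: all other ESTs in library 1
    c: ESTs matching the gene in library 2    d: all other ESTs in library 2
    """
    row1, row2 = a + b, c + d      # library sizes
    col1 = a + c                   # total ESTs matching the gene
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability that x of the matching ESTs fall in library 1
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of every table no more likely than the observed one
    # (the small tolerance guards against floating-point ties)
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Hypothetical counts: 20 of 5,000 ESTs in library A versus 3 of 5,000 in library B
p_value = fisher_exact_two_sided(20, 4980, 3, 4997)
print(f"p = {p_value:.2g}")
```

Because the test is exact, it remains valid for the small counts typical of rarely sampled transcripts, where approximations such as the chi-square test are unreliable.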

An  example  of  a  DDD  analysis  of  three  cDNA 
libraries  made  from  prostate  tissue  is  shown  in  Fig¬ 
ure  1.  Row  1  shows  an  expression  profile  of  a  gene 
that  has  a  known  function,  whereas  rows  2  to  5 
show expression differences between genes of unknown
function, referred to as ESTs. Row 1, column D,
shows that all three prostate cDNA libraries
have increased expression of the DIO1 gene
compared  with  that  of  control.  Within  the  prostate 
libraries,  column  A  shows  increased  gene  expres¬ 
sion  compared  with  that  of  the  precancerous  li¬ 
brary  in  column  B,  but  was  not  shown  to  be  statis¬ 
tically  significantly  increased  when  compared  with 
malignant  prostate  cancer  tissue  libraries.  The 
power  of  this  analysis  is  in  identification  of  possi¬ 
ble  biomarkers  within  anonymous  EST  sequences. 
In row 2, the expression of this EST is increased
over control in only the precancerous prostate
cDNA library, whereas the ESTs in rows 3 and 4 are
increased only in cancerous prostate tissues. These
genes  could  be  evaluated  as  candidate  biomarkers 
to  identify  prostate  cancer  disease  progression.  The 
Prostate Expression Database (PEDB)
(http://www.mbt.washington.edu/PEDB)6 is another online resource
of prostate genetic information. The PEDB
is  a  curated  relational  database  and  suite  of  analysis 
tools  designed  specifically  for  the  study  of  prostate 
gene  expression  in  normal  and  diseased  states. 
The  ESTs,  derived  from  more  than  40  human 
prostate cDNA libraries, are assembled into distinct
species groups that are annotated with information
from the GenBank, dbEST, and UniGene
public sequence databases. The expression
pattern  of  each  gene  can  be  viewed  across  all 
libraries  or  tissues  using  the  Virtual  Expression 
Analysis  Tool  (VEAT),  a  graphical  user  interface 
written  in  Java  for  intra-  and  interlibrary  gene 
expression  analyses. 

cDNA  EXPRESSION  ARRAYS  FOR 
BIOMARKER  IDENTIFICATION 

The  inherent  heterogeneity  of  prostate  cancers 
and the diversity of therapeutic interventions suggest
that no single biomarker or intermediate endpoint
is likely to provide sufficient sensitivity or
specificity for assessing a treatment response.
Efforts have therefore been directed toward methods
of simultaneously measuring multiple biomarkers
at the DNA, RNA, or protein levels.
Such a multiplexed approach will
greatly  expand  the  information  gained  from  each 
patient  sample  and  clinical  trial.  In  addition,  pat¬ 
terns  in  biomarker  data  may  be  identified  that  to¬ 
gether  exceed  the  sum  of  individual  measure¬ 
ments. 

Recent developments in informatics, miniaturization,
and robotics have provided new, extremely
powerful approaches for comprehensive measurements
of genetic alterations that occur in neoplasia.
These measured alterations could also reflect a
response (or lack of response) to a chemopreventive
or therapeutic agent. One such comprehensive
approach  involves  the  use  of  DNA  arrays,  a  tech¬ 
nique  that  combines  the  proven  chemistry  of  nu¬ 
cleic  acid  hybridization  with  advanced  automation 
and  imaging  technology  to  quantitatively  detect 
changes  in  the  expression  levels  of  thousands  of 
genes  simultaneously.  DNA  arrays  have  been  as¬ 
sembled  in  several  configurations,  including  oligo¬ 
nucleotide  arrays,7  microarrays  of  cDNA  spotted 
on  glass  slides,8  and  DNA  spotted  onto  nylon 
membranes.9  The  basic  method  is  straightforward: 
DNA  representing  a  particular  gene  of  interest  is 
either  spotted  (printed)  or  synthesized  onto  a  solid 
support,  such  as  a  glass  microscope  slide,  silicon 
wafer, or nylon membrane (Figure 2). The procedure
is repeated in an automated fashion with
thousands  of  different  genes,  such  that  each  is  de¬ 
posited  in  a  precise  spatial  location  that  allows  for 
the  subsequent  identification  of  any  individual 
spot.  Probes  representing  the  expressed  genetic  in¬ 
formation  in  a  tissue  sample  are  labeled  with  radio¬ 
active  or  fluorescent  markers  that  can  be  quantified 
by  sensitive  detectors  and  used  for  comparative 
analyses.  A  limitation  on  the  number  of  individual 
elements  that  can  be  placed  on  the  area  of  a  given 
“chip”  array  places  a  premium  on  efficient  con¬ 
struction.  This  is  accomplished  by  eliminating  re¬ 
dundancy  (maximizing  diversity),  and  incorporat¬ 
ing  DNA  sequences  that  are  relevant  for  the 
biological  system  under  study. 
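The construction principle described here, eliminating redundancy while retaining sequences relevant to the system under study, can be sketched as a simple selection step. The sketch assumes that each clone has already been assigned to a sequence cluster; the clone and cluster identifiers are hypothetical, and keeping the first clone seen per cluster is an arbitrary choice made only for illustration.

```python
def select_nonredundant(clones, relevant_clusters, capacity):
    """Pick at most one clone per cluster, restricted to relevant clusters.

    clones: iterable of (clone_id, cluster_id) pairs
    relevant_clusters: set of cluster IDs expressed in the tissue of interest
    capacity: maximum number of elements the array can hold
    """
    chosen = {}
    for clone_id, cluster_id in clones:
        if cluster_id not in relevant_clusters:
            continue                      # not relevant to the biological system
        if cluster_id in chosen:
            continue                      # redundant: cluster already represented
        chosen[cluster_id] = clone_id
        if len(chosen) == capacity:
            break                         # the array is full
    return list(chosen.values())


# Hypothetical clone-to-cluster assignments
clones = [("c1", "Hs.1"), ("c2", "Hs.1"), ("c3", "Hs.2"), ("c4", "Hs.9")]
array_layout = select_nonredundant(clones, {"Hs.1", "Hs.2"}, capacity=10)
print(array_layout)
```

In practice the choice among redundant clones would likely depend on clone quality or insert length, information that this sketch does not model.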

Gene  expression  catalogs,  such  as  the  CGAP  and 
the PEDB,6,10 can be exploited for the construction
and  analysis  of  cDNA  expression  arrays  by  provid¬ 
ing  a  virtual  archive  of  thousands  of  genes  ex¬ 
pressed  in  prostate  tissue.  Coupling  this  virtual  re¬ 
pository  with  the  physical  clones  representing  the 
corresponding  DNA  molecules  allows  for  the  con¬ 
struction  of  comprehensive  arrays.  The  continued 
expansion  of  this  resource  to  encompass  all  pros¬ 
tate  transcripts  will  allow  for  the  simultaneous 
analysis  of  all  genes  expressed  in  normal  and  neo¬ 
plastic  prostate  cells.  This  effort  will  require  exten¬ 
sive  testing  on  prostate  tumors  and  a  further  refine¬ 
ment  of  the  methods  to  include  statistical  measures 
of  biological  and  experimental  variance. 

STATISTICAL  ISSUES  IN  THE  ANALYSIS  OF 
cDNA  MICROARRAYS 

Special-purpose, tissue-specific cDNA microarrays
can now be routinely generated with commercially
available spotting robots on either
glass-based or nylon-based substrates. A growing
number  of  commercial  cDNA  microarrays  are  also 
available,  giving  smaller  labs  the  opportunity  to 
use  this  technology.  Careful  attention  to  the  design 
and  statistical  analysis  of  each  experiment  is  essen¬ 
tial,  especially  given  the  high  cost  of  microarrays. 
As  with  any  assay  procedure,  microarray  data  are 
subject  to  three  major  sources  of  random  and  sys¬ 
tematic  error:  reagent  quality,  sample  preparation, 
and  laboratory  technique.  The  most  important  re¬ 
agent  is  the  microarray  itself,  which  may  be  subject 
to  significant  batch-to-batch  variability.  The  array 
may  include  clones  of  questionable  quality,  possi¬ 
bly  including  troublesome  repetitive  DNA  or  an¬ 
other  contaminating  sequence.  Variability  of  the 
substrate,  either  nylon  or  glass,  can  have  a  marked 
effect  on  the  uniformity  of  the  array  image.  Sample 
preparation  includes  all  tissue  handling,  cell  isola¬ 
tion,  RNA  extraction,  and  labeling  steps.  Unin¬ 
tended  variation  in  RNA  content  may  easily  result 
from poor temperature control, heat shock, degradation,
sample handling, etc. Fluorescence or radioactive
label incorporation may also be subject to
variation and can strongly influence the results.
During the hybridization of labeled probe to the target
cDNA on the array, carefully controlled time,
temperature, and agitation conditions should prevail.
Issues of saturation and dynamic range compression
may arise during image acquisition and storage.

FIGURE 2. cDNA microarray construction and analysis. Microarray assays are performed in a multistep process.
First, microarrays are prepared by assembling sets of cDNA clones in 96- or 384-well microtiter plates. Small
tweezer tips or needles attached to a robotic arm are used to withdraw small amounts of the DNA solution from the
microtiter plates and print them onto glass microscope slides in a precise spatial orientation with high replicative
fidelity. cDNA probes are prepared from two distinct tissue sources (eg, normal tissue and neoplastic tissue) by first
extracting RNA followed by a conversion step to cDNA that incorporates a different fluorescent dye into the different
tissue source cDNA (eg, green for normal and red for neoplastic). These labeled cDNA probes are then combined and
hybridized to the microarray such that cDNAs in the probe will attach to their complementary cDNA spot on the
microarray surface. Nonhybridizing cDNAs are removed by a washing step, and the remaining bound cDNA
molecules are quantitated by measuring the fluorescent intensity at every spot location. Array analyses determine
the ratio of intensities at each spot and thus identify specific genes that are overexpressed in normal tissue relative
to neoplastic (green spot), overexpressed in neoplastic relative to normal (red spot), or expressed at equivalent
levels (yellow spot).
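The two-color readout described in the Figure 2 caption, together with the saturation issue noted in the text, can be sketched as a per-spot classification. The sketch assumes a 16-bit scanner whose ceiling is 65,535 and a twofold threshold for calling a difference; neither value comes from the text, and the function name `classify_spot` is hypothetical.

```python
from math import log2

SATURATION = 65535  # assumed ceiling of a 16-bit image; not taken from the text


def classify_spot(green, red, fold=2.0, background=0.0):
    """Classify one spot from its green (normal) and red (neoplastic) intensities."""
    if green >= SATURATION or red >= SATURATION:
        return "saturated"            # the ratio is unreliable at the detector ceiling
    g, r = green - background, red - background
    if g <= 0 or r <= 0:
        return "below background"     # no usable signal in at least one channel
    log_ratio = log2(r / g)           # symmetric measure of relative expression
    if log_ratio >= log2(fold):
        return "red"                  # overexpressed in neoplastic tissue
    if log_ratio <= -log2(fold):
        return "green"                # overexpressed in normal tissue
    return "yellow"                   # expressed at roughly equivalent levels


# Hypothetical (green, red) intensity pairs
for green, red in [(1000, 4000), (4000, 1000), (1000, 1200), (65535, 100)]:
    print(green, red, classify_spot(green, red))
```

Working with the logarithm of the ratio treats a fourfold increase and a fourfold decrease symmetrically, which a raw ratio does not.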

By  far  the  most  straightforward  way  to  address 
each  of  these  issues  is  by  use  of  independent  repli¬ 
cated  experiments.  Apparent  gene  expression 
changes  that  persist  through  such  repeated  exper¬ 
iments  can  correctly  be  ascribed  to  interesting  bi¬ 
ological  changes  rather  than  artifacts  of  the  assay 
itself. The following illustrative analysis of duplicate
experiments easily screens out many artifactual
expression changes. We compared a melanoma
cell line to a prostate tumor cell line for
expression  differences  on  a  prostate-specific,  ny¬ 
lon-based  cDNA  array.11  Spot  intensities  were 
quantified  using  the  P-SCAN  software  (available  at 
http://abs.cit.nih.gov/PSCAN).  The  intensities  of 
each  spot  were  compared  in  Figure  3A,  which  at 
first  seems  to  indicate  that  a  large  number  of  genes 
have  greater  than  fourfold  changes  in  relative  ex¬ 
pression  levels  between  the  two  cell  types.  Analysis 
of  a  duplicate  experiment  gave  a  different  picture. 
Figure  3B  shows  that  a  much  smaller  number  of 
genes  undergo  greater  than  fourfold  changes  con¬ 
sistently  in  both  experiments,  meaning  that  many 
of  the  apparent  fourfold  changes  in  the  first  exper¬ 
iment were “false-positives.” A family of differentially
expressed genes also clearly emerged and was
later confirmed by Northern blot analysis. Reduction
in the number of false-positives can be essential
when using microarray technology to look for
new cancer markers, as tens of thousands of clones
must be screened.

FIGURE 3. Comparison of melanoma (M) expression levels to that from a prostate tumor cell line (P).11 (A)
Normalized intensities show more than 148 genes with apparent expression change over fourfold (up, squares, or
down, triangles). (B) Expression ratios are compared for duplicate experiments (P1/M1, P2/M2). Only 31 genes are
consistently over- or underexpressed by greater than fourfold in both experiments.

FIGURE 4. Molecular profiles as surrogate endpoint biomarkers. Patients with localized prostate cancer are
treated with a chemopreventive (CP) agent. A biopsy is performed and subjected to molecular profiling by cDNA
microarray analysis. The pattern of expression is compared with reference patterns previously shown to correlate
with a tumor response or lack of tumor response to the CP agent. These data are used to guide further therapeutic
intervention.

UROLOGY 57 (Supplement 4A), April 2001

APPLICATION TO CHEMOPREVENTION

Among their many applications, database and array-based
methods of genetic analysis can be useful
for the identification, acquisition, and assessment
of candidate molecular markers that could be
adopted as surrogate endpoints for assessing preventive
strategies (chemoprevention or nutritional
intervention).  One  scenario  involves  a  cohort  of 
patients  diagnosed  with  low-  or  intermediate- 
grade  prostate  cancers  by  needle  biopsy.  Patients 
who elect to forgo primary therapy (radical prostatectomy
or radiotherapy) could be offered a chemopreventive
agent aimed at halting cancer progression.
Gene expression profiles of tumor tissue
before  and  after  the  chemopreventive  agent  would 
be  assessed  for  expression  patterns  correlating 
with  a  propensity  for  the  cancer  to  progress,  indi¬ 
cating  that  a  primary  therapy  should  be  offered,  or 
for  the  cancer  to  respond  to  the  chemopreventive 
agent  and  thus  require  no  further  intervention 
(Figure  4).  The  development  of  this  type  of  assay  is 
clearly  desirable,  but  defining  predictive  patterns 
of  expression  is  not  a  trivial  task. 
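One simple way to compare a biopsy profile with reference patterns, as in the scenario of Figure 4, is to score the profile against each reference by correlation. The sketch below uses Pearson correlation, which is an assumption made for illustration and not a method prescribed by the text; the gene values and the reference labels are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def closest_reference(profile, references):
    """Return the best-matching reference label and all correlation scores."""
    scores = {label: pearson(profile, ref) for label, ref in references.items()}
    return max(scores, key=scores.get), scores


# Hypothetical log expression ratios for five genes
references = {
    "response": [2.0, 1.5, -1.0, -2.0, 0.1],
    "no response": [-1.5, -1.0, 1.2, 2.0, 0.0],
}
biopsy = [1.8, 1.0, -0.8, -1.5, 0.3]
best, scores = closest_reference(biopsy, references)
print(best, scores)
```

A real predictive assay would need reference patterns validated on independent patient cohorts; the difficulty of defining such patterns is exactly the point made in the text.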